Reinforcement Learning with Heterogeneous Policy Representations

Authors

  • Petar Kormushev
  • Darwin G. Caldwell
Abstract

In Reinforcement Learning (RL) the goal is to find a policy π that maximizes the expected future return, calculated based on a scalar reward function R(·) ∈ ℝ. The policy π determines what actions will be performed by the RL agent. Traditionally, the RL problem is formulated in terms of a Markov Decision Process (MDP) or a Partially Observable MDP (POMDP). In this formulation, the policy π is viewed as a mapping function (π : s ↦ a) from state s ∈ S to action a ∈ A. This approach, however, suffers severely from the curse of dimensionality.
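The curse of dimensionality mentioned above can be made concrete with a minimal sketch of the tabular MDP policy formulation. All names here (`num_states`, the toy action set) are illustrative, not taken from the paper:

```python
# Minimal sketch of a tabular policy pi: S -> A for a discretized MDP,
# illustrating why the table-based formulation scales poorly.
import itertools

def num_states(n_vars: int, bins: int) -> int:
    """Size of a tabular policy's domain: exponential in the number of
    state variables, since each variable is discretized into `bins` values."""
    return bins ** n_vars

# A deterministic policy stored as a lookup table over a tiny state space:
# 2 state variables, 3 bins each -> 9 states.
actions = ["left", "right"]
states = list(itertools.product(range(3), repeat=2))
policy = {s: actions[sum(s) % len(actions)] for s in states}

# The curse of dimensionality: adding state variables explodes the table.
print(num_states(2, 10))   # 100 entries
print(num_states(10, 10))  # 10 billion entries -- infeasible to enumerate
```

With 10 state variables at a modest 10-bin discretization the policy table already has 10^10 entries, which is why the paper argues for richer policy representations than the direct state-to-action mapping.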


Related articles

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward function, and action space. In other words, the agents are assumed t...


Transferring Expectations in Model-based Reinforcement Learning

We study how to automatically select and adapt multiple abstractions or representations of the world to support model-based reinforcement learning. We address the challenges of transfer learning in heterogeneous environments with varying tasks. We present an efficient, online framework that, through a sequence of tasks, learns a set of relevant representations to be used in future tasks. Withou...


Reinforcement Learning in Robotics: Applications and Real-World Challenges

In robotics, the ultimate goal of reinforcement learning is to endow robots with the ability to learn, improve, adapt and reproduce tasks with dynamically changing constraints based on exploration and autonomous learning. We give a summary of the state-of-the-art of reinforcement learning in the context of robotics, in terms of both algorithms and policy representations. Numerous challenges fac...


Nonparametric Bayesian Policy Priors for Reinforcement Learning

We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and s...


Budgeted Knowledge Transfer for State-Wise Heterogeneous RL Agents

In this paper we introduce a budgeted knowledge transfer algorithm for non-homogeneous reinforcement learning agents. Here the source and the target agents are completely identical except in their state representations. The algorithm uses functional space (Q-value space) as the transfer-learning media. In this method, the target agent's functional points (Q-values) are estimated in an automatic...
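Using Q-value (functional) space as the transfer medium, as the excerpt above describes, can be sketched as follows. This is a hedged illustration assuming a known correspondence between the two agents' state representations; the names (`map_state`, `q_source`, `estimate_target_q`) are hypothetical and not from that paper:

```python
# Sketch of knowledge transfer through Q-value ("functional") space between
# two agents that differ only in state representation. Illustrative only.

# Source agent's learned Q-table over its own state representation.
q_source = {
    ("s0", "a0"): 1.0, ("s0", "a1"): 0.2,
    ("s1", "a0"): 0.5, ("s1", "a1"): 0.9,
}

def map_state(target_state: str) -> str:
    """Assumed correspondence from target-agent states to source-agent states."""
    return {"t0": "s0", "t1": "s1"}[target_state]

def estimate_target_q(target_state: str, action: str) -> float:
    """Estimate a target Q-value by querying the source agent's Q-table
    through the state correspondence (the shared action space makes the
    action usable directly)."""
    return q_source[(map_state(target_state), action)]

# The target agent bootstraps its own Q-table from these estimates
# instead of learning every value from scratch.
q_target = {
    (t, a): estimate_target_q(t, a)
    for t in ("t0", "t1") for a in ("a0", "a1")
}
print(q_target[("t0", "a0")])  # 1.0
```

The design point is that Q-values live in a space shared by both agents (since reward, dynamics, and actions coincide), so they can carry knowledge across mismatched state representations.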




Publication date: 2013